A Preprocessing Framework and Approach for Web Applications
Authors
Abstract
The abstract extraction process relies on two heuristics. First, the content of a page is organized semantically into paragraphs. Second, readers can quickly grasp the ideas of a page from only a few sentences, and they generally locate those sentences by keywords. Based on these heuristics, we derive an algorithm that simulates a reader's browsing behavior and extracts the abstract using the keywords obtained in the previous phase. The algorithm first divides the topic content into paragraphs according to the tag tree, and then picks the sentences with high weights from each paragraph to form the abstract of the page. Two important subprocesses are described in detail as follows.

1. Identify the paragraphs of the content

The structure of container tags describes the layout of an HTML page, which makes it possible to identify the paragraphs of the content. In the topic content tree, first locate the node that is the lowest common ancestor of all leaf nodes (we call it the topic root). The topic root corresponds to the tag that exactly encloses all topic content nodes. The child nodes of the topic root then correspond to the paragraphs of the topic content. Figure 3 shows the process of identifying paragraphs. In (3) of Figure 3, each p block is a paragraph.
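The two steps described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Node` class, the function names, the sentence splitter, and the weighting rule (a sentence's weight is the sum of the weights of the keywords it contains) are all hypothetical, since the abstract does not define them. The sketch also assumes the tree has already been pruned to topic content, so the lowest common ancestor of all leaves can be found by descending from the root while a node has exactly one child.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical node of a topic content tree (already pruned of noise)."""
    tag: str
    children: List["Node"] = field(default_factory=list)
    text: str = ""


def find_topic_root(root: Node) -> Node:
    """Return the lowest common ancestor of all leaf nodes.

    In a tree where every node leads to at least one leaf, this is the
    first node on the path from the root that does not have exactly one child.
    """
    node = root
    while len(node.children) == 1:
        node = node.children[0]
    return node


def collect_text(node: Node) -> str:
    """Concatenate the text of a node and all of its descendants."""
    parts = [node.text] if node.text else []
    parts.extend(collect_text(child) for child in node.children)
    return " ".join(p for p in parts if p)


def identify_paragraphs(root: Node) -> List[str]:
    """Treat each child of the topic root as one paragraph."""
    topic_root = find_topic_root(root)
    blocks = topic_root.children or [topic_root]  # a lone leaf is one paragraph
    return [collect_text(block) for block in blocks]


def split_sentences(paragraph: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]


def sentence_weight(sentence: str, keyword_weights: Dict[str, float]) -> float:
    """Assumed weighting: sum of the weights of the keywords in the sentence."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return sum(keyword_weights.get(word, 0.0) for word in words)


def extract_abstract(root: Node, keyword_weights: Dict[str, float],
                     per_paragraph: int = 1) -> List[str]:
    """Pick the highest-weighted sentences from each paragraph, in page order."""
    abstract: List[str] = []
    for paragraph in identify_paragraphs(root):
        sentences = split_sentences(paragraph)
        ranked = sorted(
            range(len(sentences)),
            key=lambda i: sentence_weight(sentences[i], keyword_weights),
            reverse=True,
        )
        for i in sorted(ranked[:per_paragraph]):
            abstract.append(sentences[i])
    return abstract


if __name__ == "__main__":
    page = Node("html", [Node("body", [Node("div", [
        Node("p", text="Web pages contain noise. Preprocessing removes noise from pages."),
        Node("p", text="The tag tree describes layout. Weather is nice."),
    ])])])
    keywords = {"preprocessing": 2.0, "noise": 1.0, "tag": 1.5, "tree": 1.5}
    print(find_topic_root(page).tag)
    print(extract_abstract(page, keywords))
```

In the usage example, `html` and `body` each have a single child, so the descent stops at the `div`, whose `p` children become the paragraphs. Real pages would need an HTML parser and a more robust sentence splitter than this regular expression.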
Journal:
- J. Web Eng.
Volume 2, Issue
Pages -
Publication date 2004